Mobile apps to reduce depressive symptoms and alcohol use in youth: A systematic review and meta‐analysis

Abstract

Background: Among youth, symptoms of depression, anxiety, and alcohol use are associated with considerable illness and disability. Youth face many personal and health system barriers in accessing mental health care. Mobile applications (apps) offer youth potentially accessible, scalable, and anonymous therapy and other support. Recent systematic reviews on apps to reduce mental health symptoms among youth have reported uncertain effectiveness, but analyses based on the type of app‐delivered therapy are limited.

Objectives: We conducted this systematic review with youth co‐researchers to ensure that this review addressed the questions that were most important to them. The objective of this review is to synthesize the best available evidence on the effectiveness of mobile apps for the reduction of depressive symptoms (depression, generalized anxiety, psychological distress) and alcohol use among youth.

Search Methods: We conducted electronic searches of the following bibliographic databases for studies published between January 1, 2008, and July 1, 2022: MEDLINE (via Ovid), Embase (via Ovid), PsycINFO (via Ovid), CINAHL (via EBSCOHost), and CENTRAL (via the Cochrane Library). The search used a combination of indexed terms, free text words, and MeSH headings. We manually screened the references of relevant systematic reviews and included randomized controlled trials (RCTs) for additional eligible studies, and contacted authors for full reports of identified trial registries or protocols.

Selection Criteria: We included RCTs conducted among youth aged 15–24 years from any setting. We did not exclude populations on the basis of gender, socioeconomic status, geographic location or other personal characteristics. We included studies which assessed the effectiveness of app‐delivered mental health support or therapy interventions that targeted the management of depressive disorders and/or alcohol use disorders.
We excluded apps that targeted general wellness, apps which focused on prevention of psychological disorders and apps that targeted bipolar disorder, psychosis, post‐traumatic stress disorder, attention‐deficit hyperactivity disorder, substance use disorders (aside from alcohol), and sleep disorders. Eligible comparisons included usual care, no intervention, wait‐list control, alternative or controlled mobile applications. We included studies which reported outcomes on depressive symptoms, anxiety symptoms, alcohol use and psychological distress over any follow‐up period.

Data Collection and Analysis: We standardized the PICO definitions (population, intervention, comparison, and outcome) of each included study and grouped studies by the type of therapy or support offered by the app. Whenever app design and clinical homogeneity allowed, we meta‐analyzed outcomes using a random‐effects model. Outcome data measured using categorical scales were synthesized using odds ratios. Outcome data measured using continuous scales were synthesized as the standardized mean difference. We assessed the methodological quality of each included study using the Cochrane Risk of Bias 2.0 tool and we assessed certainty of the evidence using the GRADE approach.

Main Results: From 5280 unique citations, we included 36 RCTs published in 37 reports and conducted in 15 different countries (7984 participants). Among the 36 included trials, we assessed 2 trials as having an overall low risk of bias, 8 trials as having some concerns regarding risk of bias, and 26 trials as having a high risk of bias. Interventions varied in the type of therapy or supports offered. The most common intervention designs employed mindfulness training, cognitive behavioral therapy (CBT), or a combination of the two (mindfulness + CBT).
Other interventions included self‐monitoring, medication reminders, cognitive bias modification or positive stimulation, dialectical behavioral therapy, gamified health promotion, or social skill building. Mindfulness apps led to short‐term improvements in depressive symptoms when compared to a withheld control (SMD = −0.36; 95% CI [−0.63, −0.10]; p = 0.007, n = 3 RCTs, GRADE: very low certainty) and when compared to an active control (SMD = −0.27; 95% CI [−0.53, −0.01]; p = 0.04, n = 2 RCTs, GRADE: very low). Apps delivering this type of support also significantly improved symptoms of anxiety when compared to a withheld control (SMD = −0.35; 95% CI [−0.60, −0.09]; p = 0.008, n = 3 RCTs, GRADE: very low) but not when compared to an active control (SMD = −0.24; 95% CI [−0.50, 0.02]; p = 0.07, n = 2 RCTs, GRADE: very low). Improvements in psychological stress approached statistical significance among participants receiving mindfulness apps compared to those in the withheld control (SMD = −0.27; 95% CI [−0.56, 0.03]; p = 0.07, n = 4 RCTs, GRADE: very low). CBT apps also led to short‐term improvements in depressive symptoms when compared to a withheld control (SMD = −0.40; 95% CI [−0.80, 0.01]; p = 0.05, n = 2 RCTs, GRADE: very low) and when compared to an active control (SMD = −0.59; 95% CI [−0.98, −0.19]; p = 0.003, n = 2 RCTs, GRADE: very low). CBT‐based apps also improved symptoms of anxiety compared to a withheld control (SMD = −0.51; 95% CI [−0.94, −0.09]; p = 0.02, n = 3 RCTs, GRADE: very low) but not when compared to an active control (SMD = −0.26; 95% CI [−1.11, 0.59]; p = 0.55, n = 3 RCTs, GRADE: very low). Apps which combined mindfulness and CBT did not significantly improve symptoms of depression (SMD = −0.20; 95% CI [−0.42, 0.02]; p = 0.07, n = 2 RCTs, GRADE: very low) or anxiety (SMD = −0.21; 95% CI [−0.49, 0.07]; p = 0.14, n = 2 RCTs, GRADE: very low).
However, these apps did improve psychological distress (SMD = −0.43; 95% CI [−0.74, −0.12]; p = 0.006, n = 2 RCTs, GRADE: very low). The results of trials on apps to reduce alcohol use were inconsistent. We did not identify any harms associated with the use of apps to manage mental health concerns. All effectiveness results had a very low certainty of evidence rating using the GRADE approach, meaning that apps which deliver therapy or other mental health support may reduce symptoms of depression, anxiety and psychological distress but the evidence is very uncertain.

Authors' Conclusions: We reviewed evidence from 36 trials conducted among youth. According to our meta‐analyses, the evidence is very uncertain about the effect of apps on depression, anxiety, psychological distress, and alcohol use. Very few effects were interpreted to be of clinical importance. Most of the RCTs were small studies focusing on efficacy for youth at risk for depressive symptoms. Larger trials are needed to evaluate effectiveness and allow for further analysis of subgroup differences. Longer trials are also needed to better estimate the clinical importance of these apps over the long term.
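To illustrate the synthesis approach named under Data Collection and Analysis, the following is a minimal sketch of how standardized mean differences might be pooled under a random-effects model. It assumes Hedges' small-sample correction and the DerSimonian-Laird estimator of between-study variance, neither of which is specified in the abstract. The function names and the trial values are hypothetical and are not data from the included studies.

```python
import math


def hedges_g(mean_app, sd_app, n_app, mean_ctrl, sd_ctrl, n_ctrl):
    """Return (SMD, variance) for one two-arm trial.

    Assumption: Hedges' small-sample correction is applied; the review
    only states that a standardized mean difference was used.
    """
    pooled_sd = math.sqrt(
        ((n_app - 1) * sd_app**2 + (n_ctrl - 1) * sd_ctrl**2)
        / (n_app + n_ctrl - 2)
    )
    d = (mean_app - mean_ctrl) / pooled_sd
    correction = 1 - 3 / (4 * (n_app + n_ctrl) - 9)
    var_d = (n_app + n_ctrl) / (n_app * n_ctrl) + d**2 / (2 * (n_app + n_ctrl))
    return correction * d, correction**2 * var_d


def pool_random_effects(effects, variances):
    """Pool effects with DerSimonian-Laird weights (an assumed estimator)."""
    weights = [1 / v for v in variances]
    total = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total
    # Cochran's Q measures how far trials scatter around the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = total - sum(w**2 for w in weights) / total
    df = len(effects) - 1
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(pooled / se) / math.sqrt(2))
    return {
        "smd": pooled,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "p": p_value,
    }


if __name__ == "__main__":
    # Hypothetical symptom scores: (mean, SD, n) for app arm, then control arm.
    trials = [
        (10.2, 4.1, 40, 12.0, 4.4, 42),
        (9.8, 5.0, 25, 11.1, 4.7, 27),
        (11.5, 3.9, 60, 12.4, 4.2, 58),
    ]
    per_trial = [hedges_g(*trial) for trial in trials]
    summary = pool_random_effects(
        [g for g, _ in per_trial], [v for _, v in per_trial]
    )
    print(summary)
```

A negative pooled SMD here would indicate lower symptom scores in the app arm, matching the direction of the effects reported above.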

Mobile applications (apps) available on smartphones can be used to manage mental health symptoms. Among young people, we found that apps which deliver mindfulness-based training or CBT may reduce symptoms of depression and anxiety, but the evidence is very uncertain.

| What is this review about?
Mental health conditions among young people aged 15-24 years are of increasing concern. Young people face many limitations in accessing mental health care in the community, including limited access to services and a lack of awareness among primary care physicians, teachers, and parents.
Smartphone applications (apps) offer young people the opportunity to manage their mental health symptoms and help them overcome barriers to accessing care. This review considers how apps are designed and whether they improve mental health symptoms (depression, anxiety, distress, alcohol use) among young people. We include young people as equal members of the research team to guide our approach.

| What is the aim of this review?
This review aims to summarize the evidence on smartphone apps and determine whether these apps improve symptoms of depression, anxiety, psychological distress and alcohol use among young people aged 15-24 years.

| What are the main findings of this review?
We identified 36 trials that included 7984 young people from 15 different countries. We categorized these trials based on the type of support or therapy delivered by the app: (i) mindfulness-based apps, (ii) apps which delivered CBT, (iii) apps that delivered CBT and mindfulness together, and (iv) other app designs. These other designs included gamification, motivational strategies, and medication reminders.
We found that apps which delivered mindfulness training or CBT significantly improved short-term symptoms of depression and anxiety; however, this evidence is very uncertain. Apps which combined these two design elements reduced psychological distress. While many apps aimed to reduce alcohol use, the results were inconsistent and more research is needed. Importantly, we did not identify any harms in using these apps to manage mental health symptoms.

| What do the findings of this review mean?
Mobile apps that deliver CBT or mindfulness training may reduce symptoms of depression and anxiety, but the evidence is uncertain. We have very limited information about whether these apps reduce alcohol use. This uncertainty is due to the small sample sizes of the trials and concerns about how the trials were conducted. More research is needed to determine whether or not these designs are effective and should be recommended for young people. Future research should consider conducting trials over longer periods of time with more participants.

| How up-to-date is this review?
This review includes evidence published up to July 1, 2022.

Explanations:
a. Two studies had high concerns of risk of bias with randomization process, deviation from intended interventions, missing outcome data, and measurement of the outcome, and one study had some concerns of risk of bias due to deviation from intended interventions and missing outcome data.
b. The optimal information size was not reached.
c. Three studies had high concerns of risk of bias with randomization process, deviation from intended interventions, missing outcome data, and measurement of the outcome, and one study had some concerns of risk of bias due to deviation from intended interventions and missing outcome data.

Summary of findings 4
CBT-based mobile apps compared to an active control (i.e., controlled mobile app) for improving mental health outcomes

Explanations:
a. Two studies had high concerns of risk of bias with randomization process, deviation from intended interventions, missing outcome data, and measurement of the outcome, and one study had some concerns of risk of bias due to deviation from intended interventions and missing outcome data.
b. The optimal information size was not reached.
c. The variation of effect sizes, large I² value, p-value smaller than 0.05, and limited overlap between 95% CIs led to very serious inconsistency across the two studies.

Summary of findings 5
Combination therapy (CBT and mindfulness) compared to a withheld control (i.e., no intervention, waitlisting) for improving mental health outcomes

Outcome: Psychological stress
No. of participants: 164 (2 RCTs)
Effect: SMD 0.43 SD lower (0.74 lower to 0.12 lower)
Certainty of the evidence (GRADE): ⨁◯◯◯ Very low (b, c)

Explanations:
a. Two studies had high concerns of risk of bias with randomization process, deviation from intended interventions, missing outcome data, and measurement of the outcome, and one study had some concerns of risk of bias due to deviation from intended interventions and missing outcome data.
b. The optimal information size was not reached.
c. One study had high concerns of risk of bias with randomization process, deviation from intended interventions, missing outcome data, and measurement of the outcome, and one study had some concerns of risk of bias due to deviation from intended interventions and missing outcome data.
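As a worked check on the effect reported above, the short sketch below recovers an approximate standard error, z statistic, and two-sided p-value from an SMD and its 95% confidence interval. It assumes the interval is a symmetric normal-approximation interval (estimate ± 1.96 × SE), which this section does not state; the function name is hypothetical. Under that assumption, the result should land close to the p = 0.006 reported in the abstract for this comparison, with small differences expected from rounding of the interval limits.

```python
import math


def p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Approximate SE, z, and two-sided p from an estimate and its 95% CI.

    Assumption: the CI was built as estimate +/- z_crit * SE.
    """
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return se, z, p


# Combination therapy vs withheld control, psychological stress:
# SMD 0.43 SD lower (0.74 lower to 0.12 lower).
se, z, p = p_from_ci(-0.43, -0.74, -0.12)
print(f"SE = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```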

| The problem, condition or issue
Symptoms of depression, anxiety, and behavioral disorders are associated with considerable illness and disability among youth and have worsened during the COVID-19 pandemic (WHO, 2021; Racine, 2021). Depressive symptoms among youth doubled in the first year of the pandemic, suggesting that 1 in 4 youth globally are experiencing clinically elevated depression symptoms (Racine, 2021).
As a consequence, an increase in demand for mental health care among youth is expected. However, youth face many personal, societal, and health system-related barriers that reduce access to services for mental health and/or substance use disorders.
The COVID-19 pandemic created sustained disruption in society and schools. Public health restrictions caused isolation and reduced social and emotional support (Glover, 2020). Together, these factors have exacerbated depressive symptoms in a pre-existing youth global public health emergency (Racine, 2021).
Failing to address youth mental health conditions means conditions will often extend to adulthood, impairing both physical and mental health and limiting opportunities to lead fulfilling lives (WHO, 2021).
Even before the pandemic, 10%-20% of youth worldwide suffered from a mental health disorder, with half of all mental health conditions starting by 14 years of age (WHO, 2018). Physical, emotional, and social changes such as exposure to poverty, abuse, or violence can increase their risk of depressive disorders and alcohol use (Saluja, 2004 [...], 2023). Despite the existence of these therapies, they often exist behind significant barriers to accessing care.
Youth with depressive disorders or substance use disorders rarely access mental health or primary care services due to personal and systemic barriers (Ross, 2018). The social stigma of asking for help, limitations in community services, and lack of awareness amongst family physicians, teachers, and parents often leave youth alone and untreated (Cash, 2018; Malla, 2019). However, failure to receive proper mental healthcare has grave consequences for youth, including disabling depression, suicide, expulsion in early education, later unemployment and higher incarceration rates (Cash, 2018).
Furthermore, untreated youth do not learn about the etiology of their mental health symptoms, and many conditions may persist into adulthood, even though studies suggest that symptoms of depressive disorders are more easily treated and prevented during adolescence (Pedrelli, 2015; NIMH, 2021).
Given the stigma associated with accessing mental health services, mobile applications (apps) offer a unique opportunity to provide mental health support in a non-judgemental, readily accessible, and large-scale manner. With more than 8.93 million apps available for mobile download (Koetsier, 2020), applying a digital treatment approach to depressive disorders and alcohol use may offer a means to help manage the heightened mental health needs of youth.
Moreover, youth are uniquely positioned to benefit from digital health resources given their increasing acceptance and usage of technology (Hyden, 2011). Mobile apps may also offer a solution to youth who would otherwise lack the independence to access treatment on their own or would prefer the discretion and anonymity that digital platforms can offer (Grist, 2017).
Apps are available in many designs with different features, but they often aren't "reinventing the wheel"; they build on decades of psychological research on the foundational principles of mental health treatment. For example, the delivery of CBT builds on decades of empirical work on the psychological construct that individuals' interpretations of situations influence their reactions (emotional, behavioral, physiological). However, one important continuing challenge for researchers and clinicians is to develop ways to deliver quality CBT treatment to the individuals who need it most. This involves both adapting treatment for diverse populations and creating effective and efficient treatment delivery models, including the expansion of digital methods of delivery (Beck, 2021).
Indeed, examining the effectiveness of digital mental health interventions is a current research priority (Hollis, 2018). With so many apps to choose from, youth, clinicians, parents and other trusted support persons need more information about which types of app-delivered therapy, if any, work best for addressing their mental health concerns. Recent systematic reviews on standalone mobile apps for mental illness symptoms report promising results (Wang, 2018; Lecomte, 2020; Miralles, 2020; Leech, 2021; Weisel, 2019), but analyses based on the type of therapy or supports they deliver are limited and certainty of evidence remains low.

| Description of the condition
The COVID-19 pandemic had a grave effect on depressive symptoms and alcohol use in adolescents and youth. Depressive symptoms in youth doubled in the first year of the COVID-19 pandemic and pooled estimates suggest that 1 in 4 youth globally are experiencing clinically elevated depression symptoms and that an increase in mental health care utilization is expected (Racine, 2021).
Heavy use of alcohol among youth increased by 43% between 2012 and 2020 (Pollard, 2020). The COVID-19 pandemic created sustained disruption in society, schools, and universities, including increasing poverty and discrimination, and public health restrictions have caused isolation and reduced social and emotional support (Glover, 2020). Together, these factors make depressive symptoms in youth a global public health emergency (Racine, 2021).
[...] and post-treatment condition management (Chandrashekar, 2018).

| Description of the intervention
Self-management apps, for example, allow the user to catalog symptoms and set up medication reminders, and they provide tools or help develop skills such as mindfulness for managing stress, anxiety, or sleep problems. Other apps may help improve thinking skills, support illness management, and connect patients with online support groups or healthcare professionals.
Mobile apps also provide users with the opportunity to increase their awareness about their symptoms or daily habits which contribute to depressive disorders. With increased awareness comes more opportunities to intervene (Prentice, 2014). Furthermore, for youth in particular, there is growing evidence that addressing depressive disorders and alcohol use may be a positive step toward developing social and emotional capabilities that could influence educational attainment, employment, and health (McNeil, 2012). Multipurpose or treatment apps may also reduce symptoms and the need for in-person appointments with clinicians, alleviating their workloads (Van Ameringen, 2017).

| How the intervention might work
Mobile apps can be used almost anytime and anywhere, making them ideal for youth facing assessment or treatment barriers. With the cost of mobile apps being significantly less than that of traditional care, they could also provide care where help may not otherwise be affordable or available. Additionally, given the stigma surrounding mental health, mobile apps offer a unique anonymous and non-judgemental first step for those who may have avoided services for mental health concerns in the past. Anonymity may also be beneficial for youth who would otherwise lack the autonomy to access treatment on their own or simply do not know where to start.
The precise mechanisms through which apps can assist youth in addressing mental health concerns depend on the design of the app itself, that is, the type of psychological theory, "treatment type" or supports offered to youth by the app. Given the current recommendations for CBT and its derivative treatment types for youth, it is sensible that apps could also apply these treatment modalities through a digital approach. Figure 1 summarizes the potential conceptual pathways for these digitally provided treatments for CBT and mindfulness therapy. We acknowledge that other types of therapies, or combinations of these therapies, could be available through apps. We provide this figure as an example of the potential conceptual pathways, but recognize that many other pathways could exist and are in need of further exploration.
In summary, youth experiencing common mental health symptoms may seek out accessible and non-judgemental apps to address these concerns. An app delivering CBT, for example, hypothesizes that youth's emotions and behaviors are influenced by their perceptions of events (Fenn, 2013). The app can teach youth to be their own therapist by helping them to understand their current ways of thinking and behaving, and by equipping them with the tools to change their maladaptive cognitive and behavioral patterns (Fenn, 2013). Apps can apply a wide range of psychological [...] describes mindfulness as the ability to reperceive (also known as decentering). Decentering allows youth to stand back and witness the drama of their life without being personally immersed and engaging with it (Shapiro, 2006). Therefore, decentering through mindfulness allows youth to clarify their values and thereby better meet their needs and interests, which produces more guided attention and facilitates the achievement of various goals that may bring about greater health and wellbeing (Halliday, 2018).
However, as with any health intervention or advancement in technology, there is the potential for harm. Mobile mental health apps are not generally distributed through healthcare providers or settings (Leigh, 2015); rather, users are generally left to evaluate their app choices based on user ratings (Chiauzzi, 2019). Unfortunately, these ratings are based on subjective experiences of users, typically from a usability or visual standpoint, and are not always reflective of an app's quality in terms of improving health outcomes (Bidargaddi, 2017) or adherence to best-practice treatment guidelines (Nicholas, 2017). In fact, there are no industry-wide standards to help consumers know if an app has been proven to be effective. As such, there is concern that if a mobile app promises more than it delivers, and provides no clinical oversight, consumers may turn away from other, more effective mental health interventions. Data security for mental health apps is another widespread concern (Powell, 2014).
These apps deal with very sensitive personal information; mobile app product makers must be able to guarantee privacy. Furthermore, many available mental health mobile apps target specific disorders and label their users with a diagnosis, which can be harmful and stigmatizing (Moses, 2009). Finally, there is increasing concern for digital dependency, especially among youth. The relationship between app exposure and health in adolescents may follow an "inverted U" pattern, that is, very high exposure and very low exposure might both be associated with poorer mental health outcomes than moderate amounts of usage (Christakis, 2019).

| Why it is important to do this review
Standalone mobile apps could become scalable interventions for youth facing mental health symptoms and alcohol use around the world. The design and functionality of mobile apps continue to improve (Lecomte, 2020) and more than 8.93 million apps have emerged on the public market (Koetsier, 2020). At the same time, the isolation, lockdowns, and unemployment created by the COVID-19 pandemic have increased depressive disorders and alcohol use in youth (Racine, 2021). Systematic reviews on specific app designs and approaches are very rare and much needed to better understand the emerging app evidence base.
The prevalence of youth experiencing depressive symptoms and alcohol use disorders has more than doubled during the COVID-19 pandemic (Racine, 2021). Despite this high prevalence, youth with mental health and/or substance use disorders rarely access mental health services due to personal and systemic barriers (Ross, 2018). As adolescents are still developing, this period in their lives is critical for their physical and mental well-being (Kessler, 2007). At the same time, accessible services to match the unique needs of youth populations are limited (Malla, 2018). Health inequities put populations who are already socially disadvantaged (e.g., by virtue of being poor, female, Indigenous or members of a disenfranchised racial, ethnic, or religious group) at further disadvantage with respect to their health (Braveman, 2003). Recent user evidence suggests that mobile apps are most commonly designed for, and used by, white youth, predominantly women, and there is a need to extend their development for culturally and linguistically diverse populations.
Community clinicians need to know more about the effectiveness of mobile apps for depressive disorders and alcohol use in youth.
Clinicians find it difficult to recommend apps since ratings and reviews are not standardized or available as reliable sources (Marshall, 2020). In addition, there may be future intervention opportunities such as combining medication, existing talk therapies and mobile apps to complement treatment programs (Dellabella, 2018; Truschel, 2021).
Five systematic reviews have been published on the use of standalone mobile apps for common mental illness symptoms and alcohol use in the past few years (Lecomte, 2020; Leech, 2021; Miralles, 2020; Wang, 2018; Weisel, 2019). Lecomte (2020) presented a meta-review of 7 meta-analyses and found that results for apps targeting depressive symptoms and anxiety were of higher quality and showed small but favorable effect sizes, while apps focusing on stress or emotional health had no significant effects.
Only Leech (2021) reviewed apps with a focus on adolescents and youth; they similarly reported some favorable small effects on depression and anxiety and, importantly, did not find age to be a significant effect size modifier. Leech (2021) [...] (Grist, 2017). The heterogeneity in the definition of the intervention, dosage, duration of interventions, classification of diseases, and selection of populations may lead to a distorted portrait of the added value of mobile apps. Independent evaluation of mobile apps is also lacking (Marshall, 2020).
Several countries, such as England and Australia, have led in the implementation of nation-wide strategies to prioritize the use of digital technologies in healthcare with policies such as Future in Mind, Five Year Forward, and the UK government's new health education system as of 2020 (Department, 2019; NHS, 2015).
Furthermore, the World Health Organization's "QualityRights" pushes for access to mental health services for people around the world (WHO, 2021). Lastly, the World Bank Group launched Mental Health for Sustainable Development, which promotes change in policy to bring mental health to the center of global health and development agendas (The Lancet, 2014; Dutta, 2020). Although the COVID-19 pandemic has slowed or interrupted these programs (CAMH, 2020), there is a need to renew efforts to support youth suffering with depressive and anxiety symptoms and alcohol use.

| OBJECTIVES
The objective of this review is to synthesize the best available evidence on the effectiveness of standalone mobile apps for the reduction of depressive symptoms (depression, generalized anxiety, psychological distress) and alcohol use among youth.

| Stakeholder engagement in this review
The methods outlined below are further described in our published protocol (Magwood, 2022). We developed these methods by engaging youth with lived experience of mental health concerns or substance use. Youth were engaged as co-researchers throughout the review and were critical in the development of the research question, selection of outcomes, and determining the analytic approach. Youth participated in article selection, data extraction, interpretation of findings, and writing the final report. Youth who met authorship criteria are included as co-authors in this report.

| Criteria for considering studies for this review

| Types of studies
We included randomized controlled trials (RCTs), including any cluster-randomized trials or crossover trials. We included only RCTs because randomization is the only way to prevent systematic differences between baseline characteristics of participants in different intervention groups in terms of both known and unknown confounders, and claims about cause and effect can be based on their findings with far more confidence than almost any other type of study (McKenzie, 2023). Based on initial scoping exercises and review of other published systematic reviews, we determined that sufficient evidence from RCTs was available on our topic of interest.

| Types of participants
We included studies focusing on youth aged 15-24 years with heightened symptoms of depressive disorders, including depression and anxiety spectrum disorders or problematic alcohol use. We focused our efforts on depression, generalized anxiety and alcohol use, and excluded studies that focused specifically on bipolar disorder, psychotic disorders, eating disorders, and other substance use disorders aside from alcohol.
Our age range was selected to coincide with the United Nations definition of youth (UN, n.d.). We did not exclude populations on the basis of gender, socioeconomic status, geographic location or other personal characteristics. If we identified studies with participants with variable age ranges (e.g., high school students aged 12-15 years and young adults aged 25-30 years) we included the study if (1) the mean age of the study participants was between 15 and 24 years, or (2) age-disaggregated data were available from the authors.
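The age rule above can be expressed as a small decision function. The following is a minimal sketch rather than the screening tool used in this review: the function name and arguments are hypothetical, and it assumes that a study recruiting entirely within 15-24 years is eligible on age grounds.

```python
from typing import Optional

YOUTH_MIN_AGE = 15  # lower bound of the United Nations youth definition
YOUTH_MAX_AGE = 24  # upper bound of the United Nations youth definition


def meets_age_criterion(min_age: float, max_age: float,
                        mean_age: Optional[float] = None,
                        age_disaggregated_data: bool = False) -> bool:
    """Return True if a study satisfies the age eligibility rule (illustrative only)."""
    # The sample lies entirely within the target age range.
    if min_age >= YOUTH_MIN_AGE and max_age <= YOUTH_MAX_AGE:
        return True
    # Condition (1): the mean participant age falls between 15 and 24 years.
    if mean_age is not None and YOUTH_MIN_AGE <= mean_age <= YOUTH_MAX_AGE:
        return True
    # Condition (2): the authors can supply age-disaggregated data.
    return age_disaggregated_data
```

For example, a hypothetical trial recruiting participants aged 12-30 years with a mean age of 19.5 years would pass under condition (1), whereas the same trial with a mean age of 26 years would pass only if age-disaggregated data could be obtained.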

| Types of interventions
This systematic review focused on mobile apps that targeted the management of depressive disorders and/or alcohol use disorders.
Mental health mobile apps are apps that are available on a mobile device (smartphone, tablet, or phablet), which can be used by the participant without direct intervention from the health care provider. We excluded mental health apps that targeted wellness and prevention of psychological disorders and apps that targeted bipolar disorder, psychosis, post-traumatic stress disorder, attention-deficit hyperactivity disorder, substance use disorders (aside from alcohol), and sleep disorders, as apps for these conditions would vary in design and functionality.
We focused on mobile apps for management of depressive disorders and alcohol use that delivered interventions for youth at any time or place in the absence of a direct interaction with a healthcare provider or specialist. Therefore, we excluded any platforms or social media approach that focussed on prevention or uniquely connected patients with their healthcare provider via video-conferencing or voice calls, such as telemedicine.
Eligible comparisons included usual care, no intervention, waitlist control, and alternative or controlled mobile applications.
Interventions not eligible for inclusion included:
• Mobile phones for sending Short Message Service (SMS) messages
• Web-based interventions
• Interventions delivered through social media platforms such as Facebook, Twitter and Instagram
• Interventions delivered through email
• Telemedicine services that only provide direct interaction between patients and their remote healthcare provider via teleconferencing

Primary outcomes
We included studies which reported on the following outcomes:
• Symptoms of depression
• Symptoms of anxiety
These outcomes were selected based on consultations with youth aged 15-24. Youth were provided with a list of possible outcomes, identified by scoping the available published literature. A convenience sample of youth were asked to rank outcomes of interest and then came to consensus about the most important outcomes for analysis. To be eligible for inclusion, studies had to measure at least one of these outcomes using validated mental health (depression, anxiety, psychological distress) or alcohol use scales.
However, we considered self-report data if no validated measures were available. Examples of these validated scales include: • Depression:

Duration of follow-up
All durations of follow-up were eligible for inclusion. We categorized follow-up as short, medium or long term based on previous literature as follows:
• Short term: 3 months or less
• Medium term: Between 6 and 12 months follow-up
• Long term: Longer than 12 months follow-up

Types of settings
We included studies from any setting.

| Electronic searches
We developed a search strategy and had it peer-reviewed by a librarian with expertise in systematic review searches. We searched the following bibliographic databases: MEDLINE (via Ovid), Embase
We also contacted the authors of RCTs with incomplete data and trial registrations to determine if complete published reports were available (Supporting Information: Appendix 2).

| Searching other resources
Not applicable.

| Selection of studies
Two review authors independently assessed all potential records identified by our search strategy. They screened titles and abstracts and then the full text of relevant records. Before initiating the screening process, two reviewers undertook a screening exercise with a random sample of n = 100 records to ensure inter-reviewer agreement and calibrate the screening strategy through training, as necessary. Inter-rater agreement was measured using the kappa coefficient (Cohen, 1960), and a kappa statistic of 0.81 or higher was set as the cut-off for adequate inter-reviewer screening reliability (Landis, 1977). The kappa statistic of our initial screening exercise was 0.70, which required calibrating the screening process through holding recurrent training sessions for our team members (n = 12) and reaching consensus on records for which disagreements arose. Thereafter, we resolved any disagreements through discussion or, if required, we consulted a third review author.
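To illustrate how the kappa statistic used in the calibration exercise is obtained, the sketch below computes Cohen's kappa from two reviewers' screening decisions. The decision counts are invented for illustration and are not the data from our screening exercise; the function name is hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same set of records."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of records")
    n = len(rater_a)
    # Observed agreement: proportion of records given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed over labels.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical decisions on 100 records: 15 joint includes, 75 joint excludes,
# and 10 disagreements split evenly between the two reviewers.
reviewer_1 = ["include"] * 20 + ["exclude"] * 80
reviewer_2 = ["include"] * 15 + ["exclude"] * 5 + ["include"] * 5 + ["exclude"] * 75

kappa = cohen_kappa(reviewer_1, reviewer_2)
needs_calibration = kappa < 0.81  # cut-off adopted in this review (Landis, 1977)
```

With these invented counts, observed agreement is 0.90 and chance agreement is 0.68, so kappa falls below the 0.81 cut-off even though raw agreement looks high, which is why kappa rather than percentage agreement was used.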

| Data extraction and management
We

| Assessment of risk of bias in included studies
We assessed the risk of bias of all included studies using the Cochrane Risk of Bias 2.0 (RoB 2.0) tool (Sterne, 2019). Two reviewers independently conducted risk of bias assessments, and disagreements were adjudicated independently by a third reviewer.
We followed the detailed guidance for using the Cochrane Risk of Bias tool available from Cochrane (https://methods.cochrane.org/risk-bias-2) and from the developers via the Risk of Bias tools website (www.riskofbias.info). These guidance documents provided clear algorithms for determining whether assessments should be rated as "low risk," "some concerns" or "high risk."
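As an illustration of how domain-level ratings roll up into an overall judgement, the sketch below applies the basic rule described in the RoB 2 guidance: any high-risk domain gives an overall high risk, otherwise any domain with some concerns gives overall some concerns, otherwise the trial is low risk. It is a simplification under stated assumptions: the guidance also allows reviewers to judge a trial as high risk when several domains raise some concerns, and that judgement call is not modelled here. The function name is hypothetical.

```python
ALLOWED_RATINGS = {"low", "some concerns", "high"}


def overall_rob2(domain_ratings):
    """Derive an overall RoB 2 judgement from the five domain ratings (simplified)."""
    ratings = [rating.strip().lower() for rating in domain_ratings]
    unknown = set(ratings) - ALLOWED_RATINGS
    if unknown or not ratings:
        raise ValueError(f"unexpected or missing ratings: {sorted(unknown)}")
    # The worst domain-level rating drives the overall judgement.
    if "high" in ratings:
        return "high"
    if "some concerns" in ratings:
        return "some concerns"
    return "low"


# Hypothetical trial: concerns about randomization, high risk from missing outcome data.
example = overall_rob2(["some concerns", "low", "high", "low", "low"])
```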

| Measures of treatment effect
We categorized effectiveness data based on intervention purpose and selected data based on our specified outcomes. If a study reported multiple measures for the same outcome, we selected the outcome that was most common across studies (e.g., using the same mental health scale) to allow for pooling in meta-analyses. Outcome data measured using categorical scales were synthesized using odds ratios. Outcome data measured using continuous scales were synthesized as the standardized mean difference at follow-up. All effect estimates were accompanied by measures of variance (standard deviations) and statistical significance estimates (confidence intervals or p-values) when possible at the α = 0.05 level of significance, unless stated otherwise. We calculated measures of variance and statistical significance whenever the authors did not provide them (Borenstein, 2009).
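The two effect measures described above can be sketched as follows. This is a minimal illustration, not the exact routine used in our analyses: it assumes the standardized mean difference is computed as Hedges' adjusted g (a common choice, though the variant is an assumption here), and it applies a 0.5 continuity correction to odds ratios when any cell is zero. Function names and the example numbers are hypothetical.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference (Hedges' adjusted g) and its standard error."""
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / pooled_sd
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)  # small-sample bias correction
    g = correction * d
    total = n_t + n_c
    se = math.sqrt(total / (n_t * n_c) + g ** 2 / (2.0 * (total - 3.94)))
    return g, se


def odds_ratio(events_t, n_t, events_c, n_c):
    """Odds ratio with a 95% confidence interval computed on the log scale."""
    a, b = events_t, n_t - events_t
    c, d = events_c, n_c - events_c
    if 0 in (a, b, c, d):
        # Continuity correction so that the ratio and its variance are defined.
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, (lower, upper)


# Hypothetical follow-up depression scores (lower is better) in two arms of 50.
g, g_se = hedges_g(mean_t=10.0, sd_t=5.0, n_t=50, mean_c=12.0, sd_c=5.0, n_c=50)
# Hypothetical binary outcome: 10/100 events with the app versus 20/100 in control.
or_estimate, or_ci = odds_ratio(10, 100, 20, 100)
```

A negative g favours the intervention when lower scale scores indicate fewer symptoms, which matches the direction of the pooled SMDs reported in this review.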

| Unit of analysis issues
We considered the individual participant to be the unit of analysis in all trials.

| Dealing with missing data
We contacted the authors of the included RCTs to identify any additional data. When data were not reported in a manner that allowed synthesis, we presented them narratively in our results, as reported by the authors.

| Assessment of heterogeneity
To facilitate the process of assessing for clinical heterogeneity between studies, we standardized the PICO definitions (population, intervention, comparison, and outcome) of each included study with definitions that align with the scope of our review (McKenzie, 2023).
One reviewer followed this standardization process for each included study and a second reviewer verified these decisions. Any discrepancies were resolved by reaching a consensus or consulting a third reviewer. The categories we used to standardize the PICO definitions are presented below.
Studies that measured the same outcome (e.g., severity of depression symptoms) were then assessed for clinical heterogeneity by examining whether their standardized PICO definitions aligned with each other. We categorized interventions as follows: (1) apps which deliver CBT, (2) apps which deliver mindfulness therapy, (3) apps which deliver a combination of CBT and mindfulness, and (4) apps which deliver other therapy designs.
We also identified two types of comparison groups. The first is withheld comparisons, where a participant does not receive any intervention (e.g., they may continue with "usual care" or are waitlisted to receive the intervention). In this situation, the participant is not blinded to their intervention assignment. In contrast, "active controls" (or "controlled interventions") maintain blinding of the participant. In these circumstances, participants are provided with a placebo app which does not deliver a therapy and this is compared to an app which does deliver therapy. The findings from these trials therefore offer a lower risk of bias estimate of whether apps can deliver a mental health therapy to improve youth mental health.
Furthermore, we assessed the statistical heterogeneity of meta-analyzed results by examining the I² and χ² estimates calculated using RevMan 5.3. When we detected significant statistical heterogeneity (Higgins, 2003) we examined and reported its source by conducting sensitivity analyses of studies included in the meta-analysis.
Study populations were not large enough to allow us to investigate heterogeneity with subgroup analysis considering gender and socioeconomic status.
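For readers unfamiliar with the heterogeneity statistics named above, the sketch below shows how Cochran's Q (the χ² heterogeneity statistic) and I² are derived from study-level effect estimates and standard errors. It is an illustration only: the inputs are invented, the function name is hypothetical, and our reported values were taken from RevMan rather than from this code.

```python
def cochran_q_and_i2(effects, std_errors):
    """Return (Q, degrees of freedom, I-squared in percent) for a set of studies."""
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need matching effects and standard errors for at least two studies")
    weights = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q: weighted sum of squared deviations from the fixed-effect pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation attributable to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


# Hypothetical SMDs and standard errors for two trials (not data from this review).
q, df, i2 = cochran_q_and_i2([0.0, 1.0], [0.5, 0.5])
```

Because I² depends only on Q and the number of studies, it is imprecise when a meta-analysis contains two or three trials, as is the case for most analyses in this review.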

| Assessment of reporting biases
We used the Cochrane RoB 2.0 criteria related to selective outcome reporting to assess for reporting bias. We reported all protocols from our search that did not have a published study. We aimed to assess publication bias using funnel plots for all meta-analyses with n > 10 studies, but no meta-analyses met this criterion (Boutron, 2022).

| Data synthesis
Whenever clinical homogeneity allowed, we meta-analyzed results and created forest plots in RevMan 5.4 using a random-effects model (Borenstein, 2009). We chose a random-effects model to account for the inherent heterogeneity between the characteristics of study cohorts, intervention design, and implementation context. Results that were not pooled together were reported narratively (Popay, 2006; Campbell, 2020). All results are accompanied by a GRADE certainty assessment. Furthermore, we ascertained whether our pooled results reached the minimal clinically important thresholds of outcome measures using evidence from the scientific literature on outcome measurement tools (Kounali, 2020; Button, 2015; Kroenke, 2001), whenever possible, and by interpreting the results with our youth patient partners with lived experience (Weinfurt, 2019).
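The random-effects pooling described above can be sketched as follows. This is a minimal illustration that assumes the DerSimonian-Laird estimator of between-study variance (τ²), the classical inverse-variance approach; the estimator is an assumption here, the inputs are invented, and the function name is hypothetical.

```python
import math


def dersimonian_laird(effects, std_errors):
    """Random-effects pooled estimate using the DerSimonian-Laird tau-squared."""
    k = len(effects)
    if k != len(std_errors) or k < 2:
        raise ValueError("need matching effects and standard errors for at least two studies")
    w = [1.0 / se ** 2 for se in std_errors]  # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau-squared to each study's own variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return pooled, se_pooled, ci, tau2


# Hypothetical SMDs and standard errors for two trials (not data from this review).
pooled, se_pooled, ci, tau2 = dersimonian_laird([0.0, 1.0], [0.5, 0.5])
```

When τ² is greater than zero the study weights become more equal and the confidence interval widens relative to a fixed-effect analysis, which is the behaviour we wanted given the variation in cohorts, intervention design, and context.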

| Subgroup analysis and investigation of heterogeneity
We were unable to conduct any subgroup analysis due to insufficient data.

| Sensitivity analysis
We conducted sensitivity analyses when necessary, exploring the clinical variability of studies that contributed to heterogeneity.

| Summary of findings and assessment of the certainty of the evidence
We assessed the certainty of the evidence using the GRADE approach.

| Results of the search
Our search identified 5280 citations. After screening titles and abstracts, 393 citations were retained for full text review. Of these, 355 citations did not meet eligibility criteria and 36 trials reported in 37 publications were included in the final review (Figure 2).

| Included studies
We included 36 RCTs (7984 participants) published in English and conducted in 15 different countries, as follows: USA (n = 13), Australia (n = 5), New Zealand (n = 3), China (n = 2), South Korea (n = 2), UK (n = 2) and one RCT from each of Canada, Finland, Germany, Iceland, Ireland, Japan, the Netherlands, Spain, Sweden, and Taiwan. The majority of included studies were conducted among male and female students in university settings. Only one study focused exclusively on women (Levin, 2020). Three studies included participants of diverse gender identity (Cordova, 2020; Bruhns, 2021; Thabrew, 2022) and one study reported a majority of participants identifying as LGBTQI (Torok, 2022). One study focused specifically on youth experiencing homelessness (Thompson, 2020). Interventions varied in mental health therapy design and functionality. The most common therapy designs employed mindfulness, CBT, or a combination of the two. However, other interventions also included self-monitoring (Reid, 2011), medication reminders (Hammonds, 2015), cognitive bias modification or positive stimulation (Teng, 2019; Visser, 2020; Kageyama, 2021), DBT (Torok, 2022), gamified health promotion (Egilsson, 2021), or social skill building to address loneliness (Bruehlman-Senecal, 2020). A summary of the characteristics of all included studies is available in Table 1.

| Excluded studies
We excluded 393 studies after full-text assessment. Among these, 230 citations were identified as study protocols or trial registrations.
We excluded 40 studies because the population was not youth aged 15-24 and 27 studies due to wrong intervention. Twenty-four conference abstracts were excluded, followed by 13 studies which were not RCTs and an additional 11 not reporting relevant mental health outcomes. Finally, we excluded 9 duplicates and 2 dissertations (Figure 2).

| Risk of bias in included studies
Among the 36 included trials, we assessed two with an overall low risk of bias, 8 trials with some concern regarding risk of bias, and 26 trials with a high risk of bias. Assessments across the 5 domains are summarized in Figure 3, and the individual study assessments are presented in Figure 4.

| Bias arising from the randomization process
Domain 1 examined risk of bias arising from the randomization process. Studies were classified as low risk if the allocation sequence was random and concealed from participants, and there were no baseline differences between groups to suggest a problem with the randomization process. Nine trials (25%) were rated as low risk of bias, 23 (64%) trials had some concerns, and the remaining 4 (11%) had high risk of bias. The most common issue across studies was a lack of information about allocation concealment.

| Bias due to deviations from intended interventions
Domain 2 evaluated risk of bias due to deviations from the intended interventions. This was assessed by examining whether participants and assessors were blinded to the intervention and, consequently, whether any deviations from the intended intervention occurred and their potential effect on the outcome. Domain 2 also evaluated the appropriateness of the analysis used to estimate the effect of assignment to intervention. Six trials (16%) were rated as low risk of bias, 15 trials (42%) had some concerns, and 15 trials (42%) were rated as high risk of bias. In most studies, it was not possible to blind either the participants or the outcome assessor to the intervention.

| Bias due to missing outcome data
Domain 3 examined risk of bias due to missing outcome data. We evaluated whether outcome data were available for nearly all randomized participants and, if not, whether the missing data could have biased the results.
Seventeen trials (47%) had a low risk of bias, 6 trials (17%) had some concerns, and 13 trials (36%) had a high risk of bias regarding missing outcome data. The primary concern in these studies was a high rate of attrition and/or unequal attrition across intervention and control groups.

| Bias in measurement of the outcome
Domain 4 evaluated risk of bias in measurement of outcomes. To assess Domain 4, studies were evaluated on the appropriateness of their method to measure outcomes and whether the outcome assessors were blinded. Further, we ascertained whether measurement of the outcomes could have differed between groups, as well as whether the outcomes could have been influenced by the assessor's knowledge of intervention received. We assessed 13 trials (36%) as low risk of bias, 5 trials (14%) with some concerns, and 18 trials (50%) with high risk of bias. Most often, outcome assessors and participants as self-assessors were not blinded to condition assignment and their knowledge may have influenced the assessment of outcomes.
TABLE 1 Characteristics of included studies.

6.2.5 | Bias in selection of the reported result
Domain 5 focused on risk of bias in selection of the reported result. This was evaluated by whether a study's data were analyzed in accordance with a pre-specified analysis plan and the likelihood that the numerical result being assessed was selected on the basis of the results. Ratings for this domain were predominantly based on examination of a trial protocol or statistical analysis plan. Twenty-seven trials (75%) had a low risk of bias and 9 trials (25%) had some concerns regarding risk of bias. No trials were assessed as having high risk of bias for this domain.
Two studies provided youth with CBT-based mobile apps and compared their impact to withheld controls (McCloud, 2020; Thabrew, 2022). A meta-analysis of the two studies (Analysis 1. Thabrew, 2022) (Figure 6). There was a moderate degree of statistical heterogeneity among pooled results (I² = 45%; χ² p = 0.18; τ² = 0.04) but the small number of studies in the meta-analysis (n = 2) prevented conducting sensitivity analyses. GRADE certainty of evidence was very low.
Two additional studies used both CBT and mindfulness as the framework for their intervention design (Bruhns, 2021; Ponzo, 2020).
Four additional studies (five publications) compared the severity of depressive symptoms among participants receiving mobile apps to those receiving active control (Kauer, 2012; Reid, 2011; Teng, 2019; Torok, 2022; Visser, 2020), but the clinical heterogeneity arising from the variability of mental health therapy designs prevented pooling their results. The mobile apps used different techniques to manage depressive symptoms, including DBT (Torok, 2022), self-monitoring of mental health symptoms (Kauer, 2012; Reid, 2011), or cognitive bias modification or positive stimulation (Teng, 2019; Visser, 2020).
Three additional studies provided youth participants with CBT-based mobile apps and compared their impact to withheld controls (McCloud, 2020; Thabrew, 2022; Yang, 2022). A meta-analysis of the three studies (Analysis 2.2) found statistically significant improvements in anxiety symptoms among participants in the intervention group compared to those in the control group in the short term (Pooled SMD = −0.51; 95% CI: −0.94, −0.09; p = 0.02) (McCloud, 2020; Thabrew, 2022; Yang, 2022) (Figure 11). There was a moderate degree of statistical heterogeneity (I² = 51%; χ² p = 0.13; τ² = 0.07) and GRADE certainty of evidence was very low. We therefore conducted a sensitivity analysis to explore the source of this heterogeneity, by visually examining the forest plot and removing data from the study with a confidence interval not overlapping with those of the remaining studies (Yang, 2022). Our sensitivity analysis
Mobile apps compared to active controls (i.e., controlled or sham mobile app). Two studies compared mindfulness-based apps to active controls (Flett, 2019; Sun, 2022). A meta-analysis of the two studies (Analysis 2.4) found improvements in anxiety symptoms that approached statistical significance among participants in the intervention group compared to those in the control group in the short term (Pooled SMD = −0.24; 95% CI: −0.50, 0.02; p = 0.07) (Flett, 2019; Sun, 2022) (Figure 14). Pooled results were statistically homogeneous (I² = 0%; χ² p = 0.44; τ² = 0.00) and GRADE certainty of evidence was very low.
FIGURE 4 Risk of bias summary (RoB 2.0).
Three additional studies provided youth participants with CBT-based mobile apps and compared their impact to active controls (Fitzpatrick, 2017; Hur, 2018; Liu, 2022). A meta-analysis of two of these studies (Analysis 2.5) found that improvements in anxiety symptoms did not reach statistical significance among participants in the intervention group compared to those in the control group in the short term (Pooled SMD = −0.26; 95% CI: −1.11, 0.59; p = 0.55) (Fitzpatrick, 2017; Hur, 2018) (Figure 15). Results showed substantial statistical heterogeneity (I² = 76%; χ² p = 0.04; τ² = 0.29), but the small number of included studies in the meta-analysis (n = 2) prevented conducting a sensitivity analysis. GRADE certainty of evidence was very low. An additional study compared anxiety symptoms in the medium term (i.e., 4 months) and found that participants using a chatbot designed on the principles of CBT showed a statistically significant improvement in anxiety symptoms compared to those using bibliotherapy (Evidence not sufficient to synthesize: ANCOVA F = 5.37, p = 0.02) (Liu, 2022).
FIGURE 13 Analysis 2.3 for the outcome of anxiety; Combination therapy (CBT and mindfulness) mobile apps versus withheld controls.
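The relationship between the pooled SMDs, their 95% confidence intervals, and the p-values reported above can be checked with a short calculation. The sketch below is illustrative only: it assumes a normal approximation and an interval constructed as estimate ± 1.96 standard errors, and the function name is hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def p_from_ci(estimate, lower, upper, log_scale=False):
    """Recover the standard error, z statistic and two-sided p-value from a 95% CI.

    Set log_scale=True for ratio measures such as odds ratios, whose intervals
    are symmetric on the log scale rather than the natural scale.
    """
    if log_scale:
        estimate, lower, upper = math.log(estimate), math.log(lower), math.log(upper)
    se = (upper - lower) / (2.0 * Z_95)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p under the normal approximation
    return se, z, p


# Pooled SMDs and confidence intervals as reported for Analyses 2.2, 2.4 and 2.5.
_, _, p_cbt_withheld = p_from_ci(-0.51, -0.94, -0.09)
_, _, p_mindfulness_active = p_from_ci(-0.24, -0.50, 0.02)
_, _, p_cbt_active = p_from_ci(-0.26, -1.11, 0.59)
```

Rounded to two decimal places, the recovered p-values agree with the reported values of 0.02, 0.07 and 0.55, and an interval that includes zero always corresponds to p > 0.05 under this approximation.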
Three additional studies compared the severity of anxiety symptoms among participants receiving mobile apps to those receiving active controls (Reid, 2011; Teng, 2019; Torok, 2022), but the clinical heterogeneity arising from the variability of mental health therapy designs prevented the pooling of results. The mobile apps used different techniques to manage anxiety symptoms, including self-monitoring of mental health symptoms (Reid, 2011), cognitive bias modification or positive stimulation (Teng, 2019), or DBT (Torok, 2022).
One study of an attention bias modification mobile app showed a statistically significant improvement in trait (Evidence not sufficient to synthesize: F(8, 316) = 2.06; p < 0.05) but not state anxiety (p = 0.59) (Teng, 2019). The remaining interventions showed no statistically significant differences in anxiety symptoms at follow-up between the intervention and the control groups (Reid, 2011; Torok, 2022).
Three additional studies compared perceived psychological stress among participants receiving mobile apps to those receiving withheld controls (Hides, 2019; Kageyama, 2021; Kenny, 2020), but the clinical heterogeneity arising from the variability of mental health therapy designs prevented pooling their results. The mobile apps used different techniques to manage stress, including music therapy (Hides, 2019), self-monitoring of mental health symptoms and emotions (Kenny, 2020), or positive stimulation (Kageyama, 2021).
Studies showed no statistically significant differences in psychological stress at follow-up between the intervention and the control groups (Hides, 2019; Kageyama, 2021; Kenny, 2020).
Mobile apps compared to active controls (i.e., controlled or sham mobile app). One study compared a mindfulness-based mobile app to an active control (Flett, 2019). There were no statistically significant improvements in psychological stress among participants in the intervention group compared to those in the control group (Flett, 2019).
Two additional studies compared perceived psychological stress among participants receiving mobile apps to those receiving active controls (Reid, 2011; Torok, 2022), but the clinical heterogeneity arising from the variability of mental health therapy designs prevented pooling their results. The mobile apps used different techniques to manage stress, including self-monitoring of mental health symptoms and emotions (Reid, 2011), or DBT (Torok, 2022).
Studies showed no statistically significant differences in psychological stress at follow-up between the intervention and the control groups (Reid, 2011; Torok, 2022).

Mean effect of mobile apps on alcohol use
A total of 9 studies reported the impact of mobile apps on alcohol use among youth (Boyle, 2017; Cordova, 2020; Earle, 2018; Gajecki, 2014; Hides, 2018; Huberty, 2019; Kazemi, 2020; O'Donnel, 2019; Thompson, 2020), but clinical heterogeneity arising from mental health therapy design, outcome measurement, and time of follow-up prevented pooling any of the results. The impact of mobile apps on reducing alcohol use is, thus, reported narratively.
Two studies provided youth participants with mobile apps that allowed them to self-monitor blood alcohol levels or risky alcohol use behaviors before promoting behavioral changes (Gajecki, 2014; Thompson, 2020). One study measured alcohol drinking behaviors before engaging in sexual acts and found statistically significant lower odds of drinking alcohol among participants receiving the mobile app compared to those in the control group in the short term (OR = 0.14; 95% CI: 0.03, 0.64; p = 0.01) (Thompson, 2020). The other study measured the number of weekly binge drinking episodes and found no statistically significant differences between participants in the intervention group and those in the control group in the short term (Gajecki, 2014).
Two studies compared a gamified personalized normative feedback mobile app to an attention-controlled app (Earle, 2018) or a web-based version of the intervention (Boyle, 2017). In one study, and relative to an attention-controlled app, pairwise comparisons showed a statistically significant reduction in the number of times that participants had "partied" during the past week in the short term (MD = −0.28; 95% CI: −0.47, −0.08; p = 0.005) (Earle, 2018). In the other study, and relative to the web-based version of the intervention, regression results showed a statistically significant effect of intervention assignment favouring the mobile app group on alcohol consumption in the short term (Regression coefficient B = −0.1; SE = 0.03; p < 0.001) (Boyle, 2017).
Two studies of mobile apps that used brief motivational interviewing (MI) as the framework for mental health therapy design compared their impact on problematic alcohol use to withheld controls using the Alcohol Use Disorders Identification Test (AUDIT) (Hides, 2018; Kazemi, 2020). However, clinical heterogeneity arising from time of follow-up prevented pooling results. One study measured the outcome in the short term (6 weeks follow-up) and found a reduction in AUDIT scores among participants in the intervention group that was statistically significant (p = 0.01), whereas changes among participants in the control group were not (Kazemi, 2020). The other study measured the outcome in the medium term (6 months follow-up) and found that any reduction in AUDIT scores was not statistically significant (F(1, 182) = 1.16; time × group interaction p = 0.28) (Hides, 2018).
Three additional studies delivered mobile apps that used different therapy designs to manage alcohol use, such as empowerment-based interactive content (Cordova, 2020) the control group that was not statistically significant in the short term (p = 0.14) (Cordova, 2020). Two studies, using two different therapy designs, measured risky single-point alcohol use (i.e., binge drinking) and did not find any statistically significant improvements relating to this outcome in the short term (Huberty, 2019; O'Donnel, 2019).

| Summary of main results
This systematic review synthesizes the effects of 36 randomized controlled trials conducted among youth from 15 countries. We investigated the effectiveness of mobile apps on the management of depressive symptoms, anxiety symptoms, psychological distress, and alcohol use. Our results suggest that apps with different therapy designs may improve mental health outcomes, but the evidence base is very uncertain. Meta-analyses showed that apps which used mindfulness or CBT designs reduced symptoms of anxiety and depression. Apps which combined CBT and mindfulness reduced psychological distress. The evidence on the effects on alcohol use was variable and inconsistent. Additional mental health therapy designs which considered empowerment, gamification and motivational strategies showed some promise to reduce alcohol use. Across all trials, only a small number reported effects of clinical importance, with reductions in symptoms reaching a minimal clinically important threshold (Kounali, 2020; Button, 2015; Kroenke, 2001). Despite limited clinical importance, no harms or adverse events associated with app use were identified.
A minimal clinical threshold should be viewed as an estimate of clinical importance, but it does allow us to better compare our results to in-person or virtual mindfulness and CBT interventions.
Without a doubt, traditional talk therapy, mindfulness programs (Maynard, 2017) and CBT (Hofmann, 2012) show larger effects on clinical outcomes. This contextualization suggests mobile apps have only slight clinical relevance at this time, and may be best viewed as adjunctive therapy while waiting for less accessible, but more effective, talk therapies. Therefore, mobile apps could represent the first step in a stepped-care approach to addressing youth mental health.
A major contribution of this review is the analysis of effectiveness outcomes based on their underlying psychological theory and treatment approach (which we term "mental health therapy design"). This approach to meta-analysis is unique in that it contributes a scientific foundation that allows a comparison of mobile apps with other evidence-based psychological interventions. Specifically, our analytic approach allows scientists and clinicians to compare delivery mechanisms of CBT and mindfulness. For example, the findings of this review could be compared to in-person CBT or other mindfulness training, or to apps which include coaching or linkage to therapy (Graham, 2020).

| Overall completeness and applicability of evidence
Our systematic review represents an assessment of apps to manage depression, anxiety and alcohol use among youth aged 15-24. However, several limitations of the evidence base limit the applicability of our findings. First, we only included evidence from randomized trials, and it is therefore possible that other evidence pertaining to the effectiveness of mental health apps for youth may have been excluded. Additionally, we only searched for evidence in bibliographic databases and did not search the gray literature. Future reviews should consider additional sources of evidence to provide a more comprehensive assessment of the available evidence. Second, most trials were conducted among small samples of post-secondary school students in high income countries. These cohorts typically represent well-educated and high socioeconomic status populations. Future research should consider out-of-school youth and populations residing in low- and middle-income countries. Third, our interventions of interest were apps designed to help manage existing symptoms of depression and anxiety, but not prevention. As such, we excluded apps which aimed to prevent mental health problems or otherwise focused on overall wellness. Given youths' avoidance of mental health interventions, this possibly represents a body of interventions that youth may access to indirectly help with mental health concerns. Finally, all studies were short-term and we therefore have no indication of the long-term potential of these apps to improve mental health. Future trials should be conducted with larger and more diverse populations of youth and assess effects at 6 months, 12 months, and beyond.

| Quality of the evidence
The results of this review should be interpreted with caution. Across all outcomes and for all different intervention designs, we assessed a very low certainty of evidence. Future research is likely to change both the size and direction of the estimate of effect for depression, anxiety, psychological distress, and alcohol outcomes. Additionally, estimates are based on meta-analyses using very few studies (n = 2-4 per analysis). These studies had small sample sizes which impacted the precision and power of the estimates. Further, most trials included in this review (26/36) had a high risk of bias, and only two trials were determined to be of low risk using the Cochrane Risk of Bias 2.0 tool.
Collectively, these issues reduce the certainty in the review findings.

| Potential biases in the review process
We conducted a search in 5 bibliographic databases relevant to the health sciences up to July 1, 2022. All citations were screened in teams by two independent screeners from the review team, and one review author (AS) assessed all included studies against inclusion criteria. However, we did not conduct a gray literature search. Therefore, it is possible that additional evidence exists that was not identified for inclusion in our review. We were unable to comment on

1 | PLAIN LANGUAGE SUMMARY
1.1 | Apps that use Cognitive Behavior Therapy (CBT) and mindfulness reduce mental health symptoms among young people
theories. Another alternative is mindfulness training. This model
FIGURE 1 Conceptual framework for Cognitive Behavior Therapy (CBT) and mindfulness apps which address common mental health symptoms among youth.

(via Ovid), PsycINFO (via Ovid), CINAHL (via EBSCO), and CENTRAL (via the Cochrane Library). The search was restricted from January 1, 2008, to July 1, 2022. The 2008 start date was selected to coincide with the release of Google Play and Apple's App Store (Apple, 2008). There were no language restrictions. The search used a combination of indexed terms, free text words, and MeSH headings. Complete search strategies for all databases are included in Supporting Information: Appendix 1. Additionally, we screened the included studies of relevant systematic reviews and the reference lists of included RCTs to identify additional RCTs for inclusion. We also searched clinicaltrials.gov and the WHO trial registry for any records of ongoing or published studies not captured by our database search.
developed a standardized data extraction sheet. This extraction framework was piloted with a random sample of n = 3 included records and revised accordingly in order to ensure the validity of the data extraction form. Two reviewers extracted data in duplicate and independently and compared results afterwards. Any discrepancies in data extraction were resolved by discussion or with the help of a third reviewer. Reviewers extracted the following variables: (1) Context of the study: geographical, epidemiological, gender, socio-economic status contextual data; (2) Study methodology: objective, study design, methodological details such as processes for randomization, allocation and blinding, target population, recruitment and sampling procedures, setting, participant eligibility criteria, and participant baseline characteristics; (3) Intervention: name, description, components (e.g., timing, frequency, route of delivery), and details of the comparison intervention; (4) Outcomes: definitions, instrument and scale interpretation, timing of outcome measures, adverse events, and usability; (5) Results: participant follow-up, binary (dichotomous) data, continuous data, between-group estimates, and qualitative key findings; and (6) Author conclusions, funding and conflict of interest.
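
The duplicate-extraction step described above can be sketched as a comparison of two reviewers' forms that flags the fields needing discussion. This is only an illustrative sketch: the section keys, field names, and example values are hypothetical and do not reproduce the review's actual extraction sheet.

```python
# Hypothetical top-level sections of the extraction form, following the six
# categories of variables listed above; the key names are illustrative.
SECTIONS = ["context", "methodology", "intervention",
            "outcomes", "results", "conclusions"]

def find_discrepancies(reviewer_a, reviewer_b):
    """Return the (section, field) pairs on which two extractions differ."""
    discrepancies = []
    for section in SECTIONS:
        fields_a = reviewer_a.get(section, {})
        fields_b = reviewer_b.get(section, {})
        # Check every field that either reviewer filled in.
        for field in sorted(set(fields_a) | set(fields_b)):
            if fields_a.get(field) != fields_b.get(field):
                discrepancies.append((section, field))
    return discrepancies

# Made-up extractions of the same record by two reviewers.
extraction_a = {
    "methodology": {"design": "RCT", "blinding": "single"},
    "outcomes": {"instrument": "PHQ-9", "follow_up_weeks": 8},
}
extraction_b = {
    "methodology": {"design": "RCT", "blinding": "none"},
    "outcomes": {"instrument": "PHQ-9", "follow_up_weeks": 8},
}

to_resolve = find_discrepancies(extraction_a, extraction_b)
print(to_resolve)
```

Each flagged pair would then go to discussion or to a third reviewer, as the text describes.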
2) found statistically significant and potentially clinically important improvements in depressive symptoms favouring the intervention group compared to the control group in the short term (Pooled SMD = −0.40; 95% CI: −0.80, 0.01; p = 0.05) (McCloud, 2020;

TABLE 1 (Continued)

ments in anxiety symptoms favouring participants in the intervention group compared to those in the control group in the short term (Pooled SMD = −0.35; 95% CI: −0.60, −0.09; p = 0.008) (Lee, 2018; Levin, 2020; Orosa-Duarte, 2021) (Figure 10). Pooled results were

FIGURE 3 Risk of bias assessments using RoB 2.0.
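
As a consistency check on the pooled estimates quoted above, the standard error and two-sided p value can be approximately recovered from a reported point estimate and its 95% confidence interval. The sketch below assumes a symmetric, normal-approximation interval; the function name is hypothetical, and this is not necessarily how the review's software computed its p values.

```python
import math

def p_value_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Back-calculate SE, z and two-sided p from an estimate and its 95% CI.

    Assumes the interval was built as estimate +/- z_crit * SE.
    """
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    # Two-sided p value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Pooled SMDs and 95% CIs as quoted in the text above.
reported = {
    "depression": (-0.40, -0.80, 0.01),
    "anxiety": (-0.35, -0.60, -0.09),
}

results = {name: p_value_from_ci(*values) for name, values in reported.items()}
for name, (se, z, p) in results.items():
    print(f"{name}: SE = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Because the reported bounds are rounded to two decimals, the recovered p values should be expected to match the published ones only approximately.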

ANALYSIS 1.1 Comparison 1: Mean effect of mobile apps on depressive symptoms, Outcome 1: Depression: Mindfulness-based mobile apps versus withheld controls.
FIGURE 5 Analysis 1.1 for the outcome of depression; Mindfulness-based mobile apps versus withheld controls.
ANALYSIS 1.2 Comparison 1: Mean effect of mobile apps on depressive symptoms, Outcome 2: Depression: Cognitive Behavioral Therapy (CBT)-based mobile apps versus withheld controls.
FIGURE 6 Analysis 1.2 for the outcome of depression; Cognitive Behavioral Therapy (CBT)-based mobile apps versus withheld controls.
ANALYSIS 1.3 Comparison 1: Mean effect of mobile apps on depressive symptoms, Outcome 3: Depression: Combination therapy (CBT and mindfulness) mobile apps versus withheld controls.
FIGURE 7 Analysis 1.3 for the outcome of depression; Combination therapy (CBT and mindfulness) mobile apps versus withheld controls.
ANALYSIS 1.4 Comparison 1: Mean effect of mobile apps on depressive symptoms, Outcome 4: Depression: Mindfulness-based mobile apps versus active controls.
FIGURE 8 Analysis 1.4 for the outcome of depression; Mindfulness-based mobile apps versus active controls.
ANALYSIS 1.5 Comparison 1: Mean effect of mobile apps on depressive symptoms, Outcome 5: Depression: Cognitive Behavioral Therapy (CBT)-based mobile apps versus active controls.
FIGURE 9 Analysis 1.5 for the outcome of depression; Cognitive Behavioral Therapy (CBT)-based mobile apps versus active controls.
ANALYSIS 2.1 Comparison 2: Mean effect of mobile apps on anxiety symptoms, Outcome 1: Anxiety: Mindfulness-based mobile apps versus withheld controls.
FIGURE 10 Analysis 2.1 for the outcome of anxiety; Mindfulness-based mobile apps versus withheld controls.
ANALYSIS 2.2 Comparison 2: Mean effect of mobile apps on anxiety symptoms, Outcome 2: Anxiety: Cognitive Behavioral Therapy (CBT)-based mobile apps versus withheld controls.
FIGURE 11 Analysis 2.2 for the outcome of anxiety; Cognitive Behavioral Therapy (CBT)-based mobile apps versus withheld controls.
FIGURE 12 Sensitivity analysis of: 2.2 Anxiety: Cognitive Behavioral Therapy (CBT)-based mobile apps versus withheld controls.
ANALYSIS 2.3 Comparison 2: Mean effect of mobile apps on anxiety symptoms, Outcome 3: Anxiety: Combination therapy (CBT and mindfulness) mobile apps versus withheld controls.
ANALYSIS 2.4 Comparison 2: Mean effect of mobile apps on anxiety symptoms, Outcome 4: Anxiety: Mindfulness-based mobile apps versus active controls.
FIGURE 14 Analysis 2.4 for the outcome of anxiety; Mindfulness-based mobile apps versus active controls.
ANALYSIS 2.5 Comparison 2: Mean effect of mobile apps on anxiety symptoms, Outcome 5: Anxiety: Cognitive Behavioral Therapy (CBT)-based mobile apps versus active controls.
FIGURE 15 Analysis 2.5 for the outcome of anxiety; Cognitive Behavioral Therapy (CBT)-based apps versus active controls.
ANALYSIS 3.1 Comparison 3: Mean effect of mobile apps on psychological stress, Outcome 1: Psychological stress: Mindfulness-based mobile apps versus withheld controls.
FIGURE 16 Analysis 3.1 for the outcome of psychological stress; Mindfulness-based mobile apps versus withheld controls.
, ecological momentary intervention (EMI) (O'Donnel, 2019), and mindfulness (Huberty, 2019). One study measured past 30-day alcohol use and found improvement favouring the intervention group compared to

FIGURE 17 Sensitivity analysis of: 3.1 Psychological stress: Mindfulness-based mobile apps versus withheld controls.